Genome-Wide Analysis of Cytochrome P450s of Alternaria Species: Evolutionary Origin, Family Expansion and Putative Functions

Cytochrome P450s are a group of monooxygenase enzymes involved in primary, secondary and xenobiotic metabolism. They have wide application in the agricultural sector, where they can serve as targets for herbicides or fungicides, and in the pharmaceutical industry, where they can function as drugs or drug structures or be used for bioconversions. Alternaria is among the most commonly encountered fungal genera; most of its species live as saprophytes in different habitats, while others are parasites of plants and animals. This study was conducted to elucidate the diversity and abundance, evolutionary relationships and cellular localization of 372 cytochrome P450s in 13 Alternaria species. The 372 CYP proteins were phylogenetically clustered into ten clades. Forty (40) clans and seventy-one (71) CYP families were identified, of which eleven (11) families appeared in one species each. The majority of the CYP proteins were located in the endomembrane system. The polyketide synthase (PKS) gene cluster was the predominant secondary metabolism-related gene cluster in all the Alternaria species studied except A. porri, where non-ribosomal peptide synthetase genes were dominant. This study reveals the expansion of CYPs in this fungal genus, evident in the family and clan expansions, which is usually associated with the evolution of fungal characteristics, especially their lifestyle as either parasites or saprophytes, with the ability to metabolize a wide spectrum of substrates. This study can be used to understand the biology, physiology and toxigenic potential of P450s in this fungal genus.


Introduction
Alternaria species are ubiquitous fungi with different life cycles comprising saprophytic, endophytic and parasitic modes of living [1]. The genus is characterized by the formation of large conidia that are usually dark and multicellular, with both transverse and longitudinal septa. Alternaria is classified in the Family: Pleosporaceae, Order: Pleosporales, Class: Dothideomycetes, Subdivision: Pezizomycotina, Division: Deuteromycotina and Phylum: Ascomycota [2,3]. The genus has been reviewed to consist of about 350 species, broadly divided into those with large spores and those with small spores, which collectively have been further subdivided into several sections based on morphological and molecular phylogenetic characteristics [4]. Saprophytic Alternaria species play an important ecological role, collaborating with other microbes to decompose and mineralize plant residues and thereby aiding the biogeochemical recycling of nutrients. Many Alternaria species have been characterized as endophytes that reside in healthy plant tissues and produce many bioactive compounds that can stimulate the growth of the host plant, suppress pathogens, improve resistance to environmental stress and aid the assimilation of nitrogen [5,6]. Other members of the genus are important plant pathogens with a broad host range, reported to cause diseases, including many post-harvest diseases, in about 400 plant species, resulting in significant economic losses in important crops such as tomatoes, potatoes and apples [1,7]. Some members infect weeds and could be processed and applied as mycoherbicides, while others cause upper respiratory tract infections and asthma in humans [1]. Recent findings have identified Alternaria sp. as one of the few fungi capable of degrading untreated extra-heavy crude oil, demonstrating its potential suitability for use in bioremediation [8].
More than 300 secondary metabolites have been described from the genus Alternaria. These belong to different categories of naturally occurring compounds, such as nitrogen-containing compounds, steroids, terpenoids, pyranones (pyrones), quinones and phenolics. They have different biological activities, such as phytotoxic, cytotoxic and antimicrobial properties, and serve as base structures for pesticides and drugs. The biological activities of these compounds have been validated by numerous pharmacologists, plant pathologists and chemists [4,9,10].
Cytochrome P450s are heme-containing monooxygenases that catalyze the transformation of a wide array of endogenous and exogenous substances. They have a wide range of functional properties, such as catalyzing the regio-, chemo- and stereospecific oxidation of a wide array of substrates, indicating their importance as major players in primary and secondary metabolism and xenobiotic degradation [11,12]. Many secondary metabolites that are important to medical, agricultural and industrial processes are biosynthesized with the help of CYPs. Fungi produce a wide spectrum of such metabolites, and their CYPs can serve as biocatalysts, as drug and agrochemical targets and as tools for the bioremediation of heavily contaminated environments [13,14]. The physiological traits of fungi, such as pathogenicity, have been associated with CYPs, and fungal pathogenicity has been reported to be a consequence of the expansion and functional diversification of fungal CYPs. CYPs also play a housekeeping role in fungi. In particular, CYP51, which is involved in sterol biosynthesis, is a popular antifungal target in the control of human and plant diseases caused by fungi. CYPs further influence the ecological roles of fungi as saprotrophs or decomposers [14]. Therefore, given their wide functional and biological roles, there is a need for extensive investigation of the different aspects of CYP function, regulation and biotechnological application [13]. Many studies have analyzed the CYPome of fungi such as Phanerochaete chrysosporium [15], Mycosphaerella graminicola [16], Grosmannia clavigera [17], Trichoderma spp. [12] and Fusarium spp. [18]. However, this information is scarce for Alternaria species and is based on individual species, especially Alternaria alternata [19][20][21].
Elucidating the comparative evolutionary process of cytochrome P450 proteins in different Alternaria species can further enhance their biotechnological exploration. Therefore, given the ecological, agricultural, industrial and medical applications and implications of this important fungal genus, this study performs a robust profiling of the CYPs in 13 species of Alternaria.

Sequence Validation
A two-step procedure was used for sequence validation, following the procedure established by [19]. Firstly, the protein sequences of each Alternaria sp. were retrieved, and sequences without Cytochrome P450 annotations (as described in the JGI database) were manually removed. Secondly, the conserved domains (CD) of the resultant sequences were further validated in the NCBI batch CD database, with the cut-off for positive hits set at an E-value of 1e-5 [15]. A total of 372 CYP protein sequences (Supplementary Files S2 and S3) from the 13 Alternaria spp. were validated and used for further analyses in the present study.
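The two validation steps can be illustrated with a minimal Python sketch. It assumes the conserved-domain results have already been parsed into simple records; the `DomainHit` type, the `validate_cyps` function and the toy identifiers are hypothetical and are not part of the JGI or NCBI tools. Only the E-value cut-off is taken from the text.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 1e-5  # cut-off for positive hits stated in the text


@dataclass
class DomainHit:
    """One row of a hypothetical, already-parsed conserved-domain result."""
    protein_id: str
    domain_name: str
    e_value: float


def validate_cyps(annotated_ids, domain_hits, cutoff=E_VALUE_CUTOFF):
    """Keep proteins that pass both validation steps.

    Step 1: the protein carries a cytochrome P450 annotation (annotated_ids).
    Step 2: it has at least one P450 domain hit at or below the E-value cut-off.
    """
    supported = {
        hit.protein_id
        for hit in domain_hits
        if "p450" in hit.domain_name.lower() and hit.e_value <= cutoff
    }
    return sorted(set(annotated_ids) & supported)


# Toy records; identifiers and values are invented for illustration only.
annotated = ["prot_A", "prot_B", "prot_C"]
hits = [
    DomainHit("prot_A", "p450 superfamily", 3e-40),
    DomainHit("prot_B", "p450 superfamily", 2e-3),   # fails the E-value cut-off
    DomainHit("prot_D", "p450 superfamily", 1e-50),  # lacks the P450 annotation
]
validated = validate_cyps(annotated, hits)
print(validated)
```

In this toy input only `prot_A` satisfies both conditions, which mirrors how a sequence must survive the annotation filter and the domain check to be retained.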

Annotation of CYPs
The selected fungal cytochrome P450 protein sequences were searched by BLAST against the Fungal Cytochrome P450 Database (FCPD) to identify the CYP families (http://p450.riceblast.snu.ac.kr/blast.php (accessed on 26 June 2021)), using the BLOSUM62 matrix and an expect-value limit of 1e-5. The predicted sequences were assigned to the CYP families and clans to which they had the highest homology (40% and above) among all named fungal CYPs in the FCPD (http://p450.riceblast.snu.ac.kr (accessed on 26 June 2021)), following the International P450 Nomenclature Committee [13].
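The assignment rule can be sketched as follows, assuming the BLAST results for one query have been parsed into (family, percent identity, E-value) records. The `BlastHit` type, the `assign_family` function and the example hits are hypothetical; only the 40% threshold and the 1e-5 expect-value limit come from the text.

```python
from typing import List, NamedTuple

FAMILY_IDENTITY_THRESHOLD = 40.0  # percent identity, as stated in the text
E_VALUE_LIMIT = 1e-5              # expect-value limit, as stated in the text


class BlastHit(NamedTuple):
    """One parsed hit of a query against named fungal CYPs (hypothetical format)."""
    family: str
    percent_identity: float
    e_value: float


def assign_family(hits: List[BlastHit],
                  min_identity: float = FAMILY_IDENTITY_THRESHOLD,
                  max_e_value: float = E_VALUE_LIMIT) -> str:
    """Return the family of the best eligible hit, or 'no family match'."""
    eligible = [
        h for h in hits
        if h.percent_identity >= min_identity and h.e_value <= max_e_value
    ]
    if not eligible:
        return "no family match"
    # The highest-identity eligible hit decides the family.
    return max(eligible, key=lambda h: h.percent_identity).family


# Invented hits for two hypothetical query proteins.
query_1 = [BlastHit("CYP505", 62.5, 1e-80), BlastHit("CYP65", 41.0, 1e-20)]
query_2 = [BlastHit("CYP51", 35.0, 1e-10)]  # below the 40% threshold

print(assign_family(query_1))
print(assign_family(query_2))
```

Queries with no hit at or above the threshold fall into the "no family match" category, which corresponds to the unassigned entries reported in the results.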

Construction of Heatmap
An interactive expression heatmap was constructed to show the distribution of the identified Cyp families in the thirteen Alternaria species. The data were uploaded to http://heatmapper.ca/expression/ (accessed on 22 February 2022), and the following parameters were used for the heatmap plot: clustering method, average linkage; distance measurement method, Euclidean; scale type, row. Clustering was applied to rows, and a custom color scheme was used.
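The clustering settings named above can be illustrated with a small, dependency-free Python sketch. It is not the Heatmapper implementation: row scaling is assumed here to mean a per-row z-score, and the function names and the toy count matrix are hypothetical.

```python
import math
from statistics import mean, pstdev


def scale_row(row):
    """Z-score one row; a constant row is mapped to zeros (assumed scaling)."""
    mu, sigma = mean(row), pstdev(row)
    return [0.0 if sigma == 0 else (x - mu) / sigma for x in row]


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(rows):
    """Agglomerative clustering of rows.

    Returns the merges in order as (members_a, members_b, distance), where the
    distance between clusters is the mean of all pairwise member distances.
    """
    clusters = [[i] for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = mean(
                    euclidean(rows[a], rows[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Toy matrix: rows are families, columns are species (invented counts).
counts = {
    "fam_1": [1, 0, 2, 0],
    "fam_2": [2, 0, 4, 0],
    "fam_3": [0, 3, 0, 3],
}
scaled = [scale_row(row) for row in counts.values()]
merges = average_linkage(scaled)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

Because `fam_2` is a multiple of `fam_1`, the two rows coincide after row scaling and are merged first; this shows why row scaling groups families by their distribution pattern across species rather than by absolute counts.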

Phylogenetic Reconstruction of CYPs
Using MEGA X software, the selected fungal cytochrome P450 proteins were aligned with ClustalW, which performs pairwise and multiple sequence alignment with gaps [22]. The evolutionary history was inferred using the maximum likelihood method and the JTT matrix-based model [23], with the initial tree obtained by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the JTT model. The topology of the tree was evaluated by bootstrap analysis with one thousand re-sampling replicates. The tree was drawn to scale, with branch lengths measured in the number of substitutions per site. The evolutionary analyses were conducted in MEGA X [22] and involved 372 protein sequences, following the description of [24].
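The idea behind bootstrap evaluation can be sketched in a few lines of Python. This is not the MEGA X procedure: instead of re-inferring a maximum likelihood tree for every replicate, the sketch uses a simple stand-in (which pair of sequences is closest by p-distance) to show how alignment columns are resampled with replacement and how support is counted. The sequences, names and functions are invented for illustration.

```python
import itertools
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def closest_pair(alignment):
    """Stand-in for tree inference: the pair of sequences with the smallest distance."""
    return min(
        itertools.combinations(alignment, 2),
        key=lambda pair: p_distance(alignment[pair[0]], alignment[pair[1]]),
    )


def bootstrap_support(alignment, replicates=1000, seed=0):
    """Fraction of replicates that recover the grouping seen in the original data."""
    rng = random.Random(seed)
    reference = closest_pair(alignment)
    agree = sum(
        closest_pair(bootstrap_alignment(alignment, rng)) == reference
        for _ in range(replicates)
    )
    return reference, agree / replicates


# Invented toy alignment of three equally long sequences.
toy_alignment = {
    "cyp_a": "MKLAVTLLGASW-PFLGHLL",
    "cyp_b": "MKLAVTLIGASWDPFLGHLL",
    "cyp_c": "MRLSITALGGTWDPYIGNLL",
}
reference_pair, support = bootstrap_support(toy_alignment)
print(reference_pair, support)
```

In a real analysis each replicate alignment would be passed to the tree-building method, and the support of a branch would be the fraction of replicate trees that contain it.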

Identification of Cytochrome P450s Associated with Secondary Metabolism-Related Gene Clusters
Secondary metabolism-related gene clusters were identified using the automated pipeline on the respective genome pages of all the Alternaria species. The gene clusters considered included NRPS, PKS, PKS/NRPS, NRPS-like and terpene cyclase clusters.

Subcellular Localization Analysis
The BUSCA integrative web server (https://busca.biocomp.unibo.it (accessed on 26 July 2021)) was used to determine the subcellular localization of the 372 proteins in order to gain more understanding of the functional mechanism of the Cytochrome P450 proteins [25].

CYP Proteins in Alternaria
A total of 372 cytochrome P450 proteins were identified in the 13 Alternaria species (Table 1). A. macrospora had the highest number of CYP protein entries (42), followed by A. dauci and A. solani, with 33 and 32 CYP protein entries, respectively. The lowest numbers were observed in A. fragaria, with 23 CYP proteins, and A. porri, with 24. Across the 13 Alternaria species, a total of 34 cytochrome P450 protein entries had no family match in the fungal cytochrome P450 database, with the majority of these in A. brassicicola (11). In contrast, all the other species had one, two or three entries with no family match.

Family and Clan Classification
The distribution of Cyp families across the thirteen (13) Alternaria species is shown as a heatmap in Figure 1, where green indicates presence and red indicates absence. The data used in generating this heatmap are presented in Supplementary Data File S4. A total of 71 Cyp families and 40 CYP clans were identified in the 13 Alternaria species, as shown in Figure 1. A. macrospora had the most diverse Cyp families (30), followed by A. dauci (27), while the species with the least Cyp family diversity was A. brassicicola (12 families). The results in Figure 1 also show that 11 Cyp families were found in only one Alternaria species each. For instance, Cyp526 and Cyp532 were only present in A. macrospora, Cyp5093 and Cyp5095 in A. porri, and Cyp545, Cyp5112 and Cyp596 in A. solani, while Cyp665, Cyp521 and Cyp61 were only found in A. gaisen, A. rosae and A. brassicicola, respectively. In contrast, Cyp552 was found in 10 Alternaria species, showing that it is more conserved than the other Cyps. This is closely followed by Cyp5103, which appeared in nine species, while Cyp65, Cyp505 and Cyp530 were found in eight Alternaria species.
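Species-specific and widely shared families of this kind can be derived from a presence table with a short Python sketch. The species and family labels below are placeholders rather than the study's data, and the `family_occupancy` helper is hypothetical.

```python
from collections import defaultdict


def family_occupancy(families_by_species):
    """Map each family to the set of species in which it was found."""
    occupancy = defaultdict(set)
    for species, families in families_by_species.items():
        for family in families:
            occupancy[family].add(species)
    return dict(occupancy)


# Placeholder presence data: species -> set of families detected.
toy_presence = {
    "species_1": {"fam_a", "fam_b"},
    "species_2": {"fam_a", "fam_c"},
    "species_3": {"fam_a", "fam_b"},
}

occupancy = family_occupancy(toy_presence)

# Families found in exactly one species, with the species that carries them.
species_specific = {
    family: next(iter(species))
    for family, species in occupancy.items()
    if len(species) == 1
}
# Families ranked by the number of species in which they occur.
ranked = sorted(occupancy, key=lambda f: (-len(occupancy[f]), f))

print(species_specific)
print([(family, len(occupancy[family])) for family in ranked])
```

The first result corresponds to families unique to a single species, and the ranking corresponds to the conservation ordering described above.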

Evolutionary Relationship
Phylogenetic analysis was carried out using the 372 aligned CYP protein sequences to demonstrate the evolutionary relationships of the CYPs in the 13 Alternaria species, as illustrated in Figure 2. CYPs belonging to the same family, regardless of the Alternaria species, were clustered in the same monophyletic clade on the phylogenetic tree, suggesting a strong evolutionary relationship. The CYPs in these organisms were clustered into ten clades, as shown in Table 2. Clade I had the highest number of branches, with 127 CYP protein entries. Fifteen CYP proteins with unidentified families were clustered in this clade, and Cyp5095 was the only Cyp family unique to it. This is followed by clade X, with 82 phyletic branches and with Cyp504, Cyp526, Cyp665 and Cyp5053 as the Cyp families found only in this clade. The fewest branches were found in clade VII and clade VIII, with two and one branches, respectively; however, none of the CYP proteins in these clades had a family match in the FCPD, as shown in Table 2. Individual phylogenetic trees for each of the thirteen (13) species of Alternaria are presented in Supplementary File S5.

Subcellular Location
The subcellular localization of the 372 cytochrome P450s in the 13 Alternaria species is presented in Figure 3. The Cytochrome P450 proteins were localized in eight subcellular compartments, with the majority located in the endomembrane system (281). The nucleus had the fewest proteins: only one cytochrome P450 protein, from A. citriarbusti, was located in this organelle.

Distribution of Secondary Metabolite-Related Gene Clusters
The secondary metabolite gene clusters of the 13 Alternaria species are shown in Figure 4. Polyketide synthase (PKS) clusters had the highest occurrence (131), with the highest number found in A. solani (14). These were followed by non-ribosomal peptide synthetase-like (NRPS-like) clusters (91), with A. macrospora having the highest number (9), and non-ribosomal peptide synthetase (NRPS) clusters (80), which occurred most frequently in A. porri. Dimethylallyl diphosphate tryptophan synthase (DMATS), polyketide synthase-like (PKS-like) and terpene cyclase (T.C.) clusters had 13, 27 and 34 occurrences, respectively, while hybrid clusters had the lowest occurrence (11).
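The relative contribution of each cluster type follows directly from the counts quoted above. The short Python sketch below uses only those counts; the total and the percentages are simple arithmetic on them and are not additional results, and the variable names are arbitrary.

```python
# Cluster counts as quoted in the text for the 13 Alternaria species combined.
cluster_counts = {
    "PKS": 131,
    "NRPS-like": 91,
    "NRPS": 80,
    "TC": 34,
    "PKS-like": 27,
    "DMATS": 13,
    "hybrid": 11,
}

total_clusters = sum(cluster_counts.values())

# Percentage share of each cluster type, rounded to one decimal place.
shares = {
    name: round(100 * count / total_clusters, 1)
    for name, count in cluster_counts.items()
}
most_common = max(cluster_counts, key=cluster_counts.get)

print(total_clusters)
for name, share in shares.items():
    print(f"{name}: {share}%")
print("most common:", most_common)
```

Expressing the counts as shares makes the predominance of PKS clusters easier to compare across cluster types.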

Discussion
Alternaria species are among the most commonly encountered fungi with the greatest global impact on humans and human activities. Many live as saprobes in various habitats, where they are involved in the degradation of a wide diversity of substances, such as leather, marine organisms, plants, wood pulp, sewage, paper, textiles, building supplies, stone monuments, optical instruments, cosmetics, computer disks and jet fuel [26]. CYPs in the Alternaria species are distributed into 71 families and 40 clans, which could be due to gene duplication events through evolution, enabling these organisms to survive and live in a wide range of habitats [27,28]. Ref. [14] reported that Cytochrome P450s are important proteins playing diverse biological roles in the survival and physiological processes of fungi, as many have been identified to play a housekeeping role. For these reasons, other members of this genus are plant parasites that act as post-harvest pathogens, destroying a large amount of agricultural output [1,7], while many others are known to cause human infections, particularly in immunocompromised patients, including dermatomycosis and respiratory tract infection. Their spores have also been identified as one of the most common and potent sources of both indoor and outdoor allergens [29].
The present study's findings revealed that some Cyp families were unique to some Alternaria species (A. macrospora, A. porri, A. solani, A. gaisen, A. rosae and A. brassicicola). Rampersad (2020) opined that distinct cyps in various fungal species might significantly affect the host specificity of each species to a given plant or animal. Alternaria species are well-known producers of several host-specific toxins, such as AM-toxin, cercosporin, ABR-toxin, AC-toxin, dothistromin, AB-toxin, AK-toxin, versicolorin B, Maculosin toxin, AF-toxin, AAL-toxin, AT-toxin, ACT-toxin and AL-toxin, ACR(L)-toxin, HC-toxin, Destruxin A, B, AS-toxin I and AP-toxin [19,20,30,31], which directly influence their virulence and pathogenicity [21]. Cyp genes were found in all the Alternaria species investigated, which suggests that they are conserved and play an important function in Alternaria. However, the overall number found varies between the examined fungi. The findings of our study revealed that five cyp families (Cyp552, Cyp5103, Cyp505, Cyp530 and Cyp65) were predicted to be conserved, as they were common across the majority of the queried Alternaria species, which suggests their significant role in this fungal genus. Four of these cyp families (Cyp552, Cyp505, Cyp530 and Cyp65) were previously reported in Aspergillus nidulans [32], Grosmannia clavigera [17], Mycosphaerella graminicola [16] and Trichoderma harzianum [12], while Cyp505 was reported in Phanerochaete chrysosporium [15]. However, the Cyp5103 family predicted in 8 out of the 13 Alternaria spp. was not reported in any of the aforementioned fungal genera, which implies that this cyp family could serve as a vital target to be harnessed for the management of these fungi or for the biosynthesis of important metabolites. Cyp51 and Cyp61, reported to be common in both plant and animal species, were predicted in only 2 of the 13 queried Alternaria species.
The spread and clustering of Cyp families into 10 phyletic clades across the 13 selected Alternaria species suggest several expansions and contractions of cyp families along a paralogous evolutionary path, which could favor the development of several fungal traits that ensure successful adaptation to and colonization of their environment, including pathogenicity, as observed by [33,34]. Fungal cyp family expansions and functional diversifications have been linked to the development of fungal pathogenicity [29]. Despite some parallels in CYPome distribution amongst the Alternaria species, the family diversity of cyp genes varies significantly between the species, in both the number and the type of families. We believe that the diversity in the cyp genes among the Alternaria species is connected to the potential need for novel physiological activities. The study revealed the different putative functions of each phyletic clade of the 13 Alternaria spp. The beneficial roles played by P450 genes in various cell functions, including neutralization of host defense, metabolism of xenobiotics and primary and secondary metabolism, have been well documented [12]. The phylogeny of all annotated cyps revealed several cyp branches in the phylogenetic tree, demonstrating their considerable evolutionary divergence.
The findings of this study also established the localization of the CYP proteins in eight different cellular components (plasma membrane, cytoplasm, endomembrane system, organelle membrane, mitochondrion, mitochondrial membrane, extracellular space and nucleus) of Alternaria spp., which suggests their varying important roles in the species. Fungal species are known to possess the class II enzyme group, which is involved in diverse cellular functions, such as biosynthesis of secondary metabolites (mycotoxins), lipid metabolism, biosynthesis of membrane sterols and detoxification of xenobiotics and phytoalexins, which is consistent with the localization of their cyps in multiple cellular compartments [35,36].
Additionally, the role of P450 genes in the biosynthesis of secondary metabolites (S.M.), such as mycotoxins in fungal species, has been well documented [36]. Even though these S.M.s have not been shown to have a direct effect on the growth and development of fungi [37], they are significant for the colonization of their environment, serving as growth inhibitors of competitors and as chemical communication signals [38][39][40]. The pathogenesis of several fungal pathogens is aided by the secondary metabolites they biosynthesize [41]. Alternaria species are notable producers of secondary metabolites, of which the majority are powerful mycotoxins known to be involved in cancer development. Over 300 known secondary metabolites belong to the steroids, terpenoids, pyranones (pyrones), quinones and phenolics and exhibit different biological activities, such as phytotoxic, cytotoxic and antimicrobial properties, serving as base structures for pesticides and drugs, and many of these are biosynthesized by CYP proteins [4,14,32,42,43]. This is evident in this study's discovery of the preponderance of the polyketide synthase (PKS) secondary metabolism-related gene cluster in all the Alternaria species studied. PKSs are multi-domain, multi-functional enzymes implicated in building the structural backbone of many secondary metabolites [41]. Other structural genes reported to aid in the synthesis of secondary metabolites include P450 monooxygenases, methyltransferases, reductases, dimethylallyl tryptophan synthase (DMATS), polyketide synthase-like (PKS-like), polyketide synthase (PKS), non-ribosomal peptide synthase (NRPS), non-ribosomal peptide synthase-like (NRPS-like) and terpene cyclases (T.C.). Previous studies [40,44,45,46] established the active role of non-ribosomal peptide synthase genes (NPS6, AbNPS2) in the biosynthesis of secondary metabolites, which directly promotes the integrity of the cell wall, viability of conidia, virulence of old spores and pathogenicity in A. brassicicola. Following the recent identification of 12 cercosporin toxin biosynthesis (CTB) genes from the whole-genome sequencing of A. alternata (Y784-BC03), a pathway for the biosynthesis of cercosporin, regulated by a non-reducing polyketide synthase, was postulated [21]. These studies indicate the importance of these secondary metabolites in fungi.

Conclusions
Our analysis has revealed the various Cytochrome P450 clans and families in the 13 Alternaria species and their distribution into the different phylogenetic groups, including their putative functions in metabolism. Here, 372 CYP proteins were identified as belonging to 71 Cyp families, revealing the diverse biological, agricultural and biotechnological potential of these fungi. These fungi belong to a commonly encountered genus whose members live in different habitats and utilize a diverse substrate spectrum. The phylogenetic clustering of these proteins into ten clades reveals their close relationships and further demonstrates the influence of gene expansion and duplication during the evolutionary process. The majority of these proteins were located in the endomembrane system, suggesting intense participation of these proteins in the active synthesis, packaging and transportation of substances in the cell. Their potential can be harnessed in the bioconversion and transformation of different compounds. The occurrence of secondary metabolite gene clusters is further evidence of the involvement of this genus in the synthesis of diverse secondary metabolites with agricultural, pharmaceutical, medical and industrial applications. This study will enable the selection of Alternaria species for various agricultural, medical and biotechnological applications, including their use in cleaning up environmental pollution.